Reducing Network Traffic and Managing Volatile Web Contents Using Migrating Crawlers with Table of Variable Information

Authors

  • Niraj Singhal
  • Ashutosh Dixit
Abstract

As the web continues to grow, searching it for useful information has become increasingly difficult. Studies also report that a substantial share of current internet traffic and bandwidth consumption is due to web crawlers that retrieve pages for indexing by the various search engines. Moreover, because of the dynamic nature of the web, it is very difficult for a search engine to provide fresh information to the user. An incremental crawler downloads only modified content from the web for a search engine, thereby helping to reduce the network load. This load can be reduced further by using migrants. The migrants migrate to the web server to download, filter, and compress documents before transferring them to the search engine side. This paper develops a more network-efficient approach for extracting volatile information from the web server using migrants with the help of a table of volatile information, which further reduces the network load significantly.
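The idea in the abstract (ship only changed pages, filtered and compressed at the server) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the layout of the table, so a plain mapping from URL to content digest stands in for it here, and the names `fingerprint`, `collect_changes`, and `unpack` are hypothetical.

```python
import hashlib
import json
import zlib
from typing import Dict, Tuple


def fingerprint(content: str) -> str:
    """Return a stable digest of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def collect_changes(
    pages: Dict[str, str], table: Dict[str, str]
) -> Tuple[bytes, Dict[str, str]]:
    """Hypothetical migrant step, run on the web-server side.

    pages: URL -> current page content on the server.
    table: URL -> digest recorded at the previous visit
           (an assumed stand-in for the paper's table).
    Returns the compressed changed pages and the updated table.
    """
    changed = {}
    updated = dict(table)
    for url, content in pages.items():
        digest = fingerprint(content)
        if table.get(url) != digest:  # new or modified page
            changed[url] = content
            updated[url] = digest
    # Compress before transfer so only a small payload crosses the network.
    payload = zlib.compress(json.dumps(changed, sort_keys=True).encode("utf-8"))
    return payload, updated


def unpack(payload: bytes) -> Dict[str, str]:
    """Search-engine side: decompress the payload sent by the migrant."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```

On a first visit with an empty table every page is sent; on later visits only pages whose digest differs from the recorded one are included in the payload.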


Similar resources

An Effective Parallel Web Crawler based on Mobile Agent and Incremental Crawling

A huge amount of new information is placed on the Web every day. Large-scale search engines update their indexes gradually and are not able to present such information in a timely manner. An incremental crawler downloads only customized content from the web for a search engine, thereby helping to reduce the network load. This network load can be reduced further by using mobile agent...

Anomaly-based Web Attack Detection: The Application of Deep Neural Network Seq2Seq With Attention Mechanism

Today, the use of the Internet and websites has become an integral part of people's lives, and most activities and important data reside on websites. Thus, attempts to intrude into these websites have grown exponentially. Intrusion detection systems (IDS) for web attacks are one approach to protecting users. However, these systems suffer from drawbacks such as low accuracy in ...

Detecting Bot Networks Based On HTTP And TLS Traffic Analysis

Bot networks are a serious threat to cyber security; their destructive behavior directly affects network performance. Detecting infected HTTP communications is a big challenge because infected HTTP connections are mixed in with other types of HTTP traffic. Cybercriminals prefer to use the web as a communication environment to launch application layer attacks and secretly enga...

Need of Securing Migrating Crawling Agent, Remote Platform and the Data Collection

Using migrating (mobile) crawling agents, the selection and filtering of web documents can be done at the web servers rather than on the search engine side, which reduces the network load caused by web crawlers. Because the mobile code from the search engine side is transferred to and executed on web servers, an environment controlled by another party, it gives rise to several security issues in mobile agent comp...

Improving Tor security against timing and traffic analysis attacks with fair randomization

The Tor network is probably one of the most popular online anonymity systems in the world. It has been built based on the volunteer relays from all around the world. It has a strong scientific basis which is structured very well to work in low latency mode that makes it suitable for tasks such as web browsing. Despite the advantages, the low latency also makes Tor insecure against timing and tr...

Publication date: 2013